{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Named Entity Recognition without labelled data: A weak supervision approach"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this walkthrough, we will develop a neural NER model without access to labelled data. \n",
    "\n",
    "**Important note**: some of the data/model files used in this walkthrough are too big to be put on the GitHub repository, but are accessible for download [here](https://github.com/NorskRegnesentral/skweak/releases/tag/0.2.8).\n",
    "\n",
    "Let's look at a particular example of text (from Reuters):\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.insert(0, '../..')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "news_text  = \"\"\"\n",
    "ATLANTA  (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished \n",
    " versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. \n",
    " The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, \n",
    " while the 16-gigabyte version is $249. A two-year service contract with AT&T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at \n",
    " Best Buy Mobile stores. \"This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,\" said \n",
    " Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
    " Best Buy, he said. Moore said AT&T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as \n",
    " Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began \n",
    " selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also \n",
    " sold at Apple stores and AT&T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by  Karen Jacobs ; Editing \n",
    " by  Andre Grenon )\"\"\"\n",
    "\n",
    "news_text = re.sub('\\\\s+', ' ', news_text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see what the standard (small) spaCy model produces (note that loading the model takes a few seconds):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Tuesday\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " iPhone 3G at its stores that are priced \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    about $50\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " less than new \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which were returned within \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    30 days\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " of purchase, are priced at $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    149\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the model with \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " gigabytes of storage, while the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " version is $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    249\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       ". A \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    two-year\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. New iPhone 3Gs currently sell for $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    199\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    299\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       "-generation \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "G models at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">FAC</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    late last month\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       ". \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " for $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    197\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    297\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " model. The \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy's\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import spacy, skweak\n",
    "\n",
    "# Load the small English spaCy model and run it on the news text\n",
    "nlp = spacy.load(\"en_core_web_sm\")\n",
    "doc = nlp(news_text)\n",
    "\n",
    "# Visualising the entities\n",
    "skweak.utils.display_entities(doc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The medium-size model works better, but its output still contains quite a few errors and omissions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Tuesday\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " at its stores that are priced \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    about $50\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " less than new \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which were returned within \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    30 days\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " of purchase, are priced at $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    149\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the model with \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8 gigabytes\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " of storage, while the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " version is $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    249\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       ". A \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    two-year\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. New iPhone 3Gs currently sell for $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    199\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    299\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       "-generation iPhones can also upgrade to the faster refurbished \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", offers refurbished iPhones online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    late last month\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       ". \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " for $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    197\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    297\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " model. The \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said Best Buy's move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "\n",
    "# We load the spacy model (takes a few seconds)\n",
    "nlp = spacy.load(\"en_core_web_md\")\n",
    "doc = nlp(news_text)\n",
    "\n",
    "# Visualising the entities\n",
    "skweak.utils.display_entities(doc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ideally, we would train a named entity recognition model tailored to the specific needs and linguistic patterns of those articles. However, although we have large amounts of raw text, we often have no text labelled with named entities for our domain. We therefore worked on an alternative approach based on __weak supervision__, combining several (noisy) supervision sources instead of relying on a single \"gold standard\". "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Indeed, we do have access to several possible supervision sources, such as alternative NER models trained on other corpora, large lists of entities (companies, person names, geographical locations), shallow linguistic patterns, and document-level constraints. \n",
    "\n",
    "The key idea behind the proposed approach is thus to (1) use these supervision sources to automatically annotate news corpora, (2) estimate a label model (more precisely, a hidden Markov model, or HMM) that unifies all these sources into a single one, and (3) train a new NER model on these unified labels."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Outline of this notebook__: I describe below the various annotation schemes that I developed.  I then explain how these various sources can be merged into a single source using the `skweak` framework. Finally, I detail the architecture behind the NER model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## __Step 1:__ Annotations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1) Annotators from other Spacy models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A first source of automatic annotation comes from NER models trained on multiple, distinct corpora. I went through [available NE-labelled corpora](https://github.com/juand-r/entity-recognition-datasets) to search for datasets that could be used to train alternative models. I trained Spacy models on all of them and ran experiments to assess their performance. In the end, I retained four models:\n",
    "- The standard Spacy model for English (`en_core_web_md`), trained on Ontonotes v5\n",
    "- A model trained on [CoNLL 2003](https://www.clips.uantwerpen.be/conll2003/ner/)\n",
    "- A model trained on the [Broad Twitter Corpus](https://github.com/GateNLP/broad_twitter_corpus)\n",
    "- A model trained on a corpus of [SEC filings](https://www.aclweb.org/anthology/U15-1010/).\n",
    "    \n",
    "Note that the entity labels differ between these models: while Ontonotes contains no fewer than [18 classes](https://spacy.io/api/annotation#named-entities), the other corpora only contain `PER(SON)`, `ORG`, `LOC` and `MISC`. Furthermore, the labels do not match each other perfectly: while Ontonotes distinguishes between geopolitical locations (`GPE`) and \"natural\" locations (such as continents, seas etc., labelled as `LOC`), the three other models group all geographical entities under `LOC`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can apply annotations from a Spacy model using the `ModelAnnotator` class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " 3G at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    New iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " 3Gs currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of first-generation \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       "-\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       "-\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " 3G for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " stores and AT&amp;T stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
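    "# Wrap the Spacy model trained on CoNLL 2003 as an annotator named \"conll2003\"\n",
    "annotator = skweak.spacy.ModelAnnotator(\"conll2003\", \"../../data/conll2003/\")\n",
    "\n",
    "# Apply it to the document and display the spans stored under that name\n",
    "doc = annotator(doc)\n",
    "skweak.utils.display_entities(doc, \"conll2003\")"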
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, this model's output is not perfect either (it splits `Wal-Mart` into two entities and tags one mention of `AT&T` as a person, for instance), but its errors differ from the ones made by the Ontonotes model. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The annotations are written in the `spans` of the Spacy document:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(ATLANTA, 'ORG'),\n",
       " (Reuters, 'ORG'),\n",
       " (Best Buy Co, 'ORG'),\n",
       " (Apple Inc, 'ORG'),\n",
       " (iPhone, 'ORG'),\n",
       " (iPhones, 'ORG'),\n",
       " (iPhones, 'ORG'),\n",
       " (AT&T Inc, 'ORG'),\n",
       " (New iPhone, 'LOC'),\n",
       " (Best Buy Mobile, 'LOC'),\n",
       " (Scott Moore, 'PER'),\n",
       " (Best Buy Mobile, 'ORG'),\n",
       " (iPhones, 'ORG'),\n",
       " (Best Buy, 'LOC'),\n",
       " (Moore, 'PER'),\n",
       " (AT&T, 'PER'),\n",
       " (iPhone, 'LOC'),\n",
       " (iPhones, 'ORG'),\n",
       " (iPhones, 'ORG'),\n",
       " (Best Buy, 'LOC'),\n",
       " (Wal, 'LOC'),\n",
       " (Mart Stores Inc, 'ORG'),\n",
       " (Wal, 'LOC'),\n",
       " (Mart, 'ORG'),\n",
       " (iPhone, 'ORG'),\n",
       " (iPhone, 'LOC'),\n",
       " (Apple, 'LOC'),\n",
       " (Moore, 'PER'),\n",
       " (Best Buy, 'ORG'),\n",
       " (Karen Jacobs, 'PER'),\n",
       " (Andre Grenon, 'PER')]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[(ent, ent.label_) for ent in doc.spans[\"conll2003\"]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each `ModelAnnotator` adds two annotation sources: one that is directly based on the Spacy model (here `conll2003`), and one that also includes the corrections specified in the method `_correct_entities` (in `spacy_wrapper.py`) that we implemented earlier this year. The corrected versions are indicated with a `+c` suffix."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the results from the three other models:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Tuesday\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " at its stores that are priced \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    about $50\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " less than new \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which were returned within \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    30 days\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " of purchase, are priced at $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    149\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the model with \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8 gigabytes\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " of storage, while the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " version is $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    249\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       ". A \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    two-year\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. New iPhone 3Gs currently sell for $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    199\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    299\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       "-generation iPhones can also upgrade to the faster refurbished \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", offers refurbished iPhones online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    late last month\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       ". \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " for $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    197\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and $\n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    297\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16-gigabyte\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " model. The \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said Best Buy's move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "core_web_annotator = skweak.spacy.ModelAnnotator(\"core_web_md\", \"en_core_web_md\")\n",
    "\n",
    "doc = core_web_annotator(doc)\n",
    "skweak.utils.display_entities(doc, \"core_web_md\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Note__: When annotating a large collection of documents, `annotator.pipe(news_docs)` is much more efficient than annotating the documents one at a time, as it batches the documents on which the NER model is run."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2) Annotators from gazetteers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another useful source of annotations is large lists of entities such as persons, places and organisations. The gazetteers use a _trie_ to search efficiently for occurrences of these entities in the text. Gazetteers can be run in two modes: case-sensitive or case-insensitive."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "#### 2.1) Wikipedia\n",
    "The Wikipedia gazetteer is extracted from the [NECKar](https://event.ifi.uni-heidelberg.de/?page_id=532) dataset. The postprocessing (which, among other things, filters out entities that are also relatively common English words) is implemented in `compile_wikidata`, and the resulting data is available as [WIKIDATA](https://github.com/NorskRegnesentral/skweak/releases/download/0.2.8/wikidata_tokenised.json.gz). In addition, I extracted a list of commercial products from Wikidata and added them to the gazetteer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting data from ../../data/wikidata_tokenised.json\n",
      "Populating trie for class PERSON (number: 2621131)\n",
      "Populating trie for class LOC (number: 47104)\n",
      "Populating trie for class GPE (number: 601419)\n",
      "Populating trie for class ORG (number: 295449)\n",
      "Populating trie for class PRODUCT (number: 12457)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE+ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE+ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import examples.ner.conll2003_ner\n",
    "\n",
    "tries = skweak.gazetteers.extract_json_data(\"../../data/wikidata_tokenised.json\")\n",
    "annotator = skweak.gazetteers.GazetteerAnnotator(\"wiki\", tries)\n",
    "\n",
    "annotator(doc)\n",
    "skweak.utils.display_entities(doc, \"wiki\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, the annotator makes some errors: `Moore` is labelled as a [geopolitical entity](https://en.wikipedia.org/wiki/Moore) instead of a person. Note that `AT&T` receives two alternative labels, `ORG` and `GPE` (see [AT&T station](https://en.wikipedia.org/wiki/AT%26T_(SEPTA_station))).\n",
    "\n",
    "In addition to the full Wikidata gazetteer, I also added a smaller gazetteer restricted to Wikidata entries that contain a text description. Its data is available as [WIKI_SMALL](https://github.com/NorskRegnesentral/skweak/raw/main/data/wikidata_small_tokenised.json.gz):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting data from ../../data/wikidata_small_tokenised.json\n",
      "Populating trie for class PERSON (number: 1863434)\n",
      "Populating trie for class LOC (number: 14241)\n",
      "Populating trie for class GPE (number: 273373)\n",
      "Populating trie for class ORG (number: 91341)\n",
      "Populating trie for class PRODUCT (number: 12457)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "tries = skweak.gazetteers.extract_json_data(\"../../data/wikidata_small_tokenised.json\")\n",
    "annotator = skweak.gazetteers.GazetteerAnnotator(\"wikismall_cased\", tries)\n",
    "annotator2 = skweak.gazetteers.GazetteerAnnotator(\"wikismall_uncased\", tries, case_sensitive=False)\n",
    "\n",
    "annotator2(annotator(doc))\n",
    "skweak.utils.display_entities(doc, \"wikismall_cased\")\n",
    "print()\n",
    "skweak.utils.display_entities(doc, \"wikismall_uncased\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see, the \"cased\" gazetteers have a higher precision than the uncased gazetteers (at a cost of lower coverage)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2.2 Crunchbase"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The second gazetteer [Crunchbase](https://github.com/NorskRegnesentral/skweak/raw/main/data/crunchbase_companies.json.gz) is extracted from the [Open Data Map from Crunchbase](https://data.crunchbase.com/docs/open-data-map), which contains lists of both organisations and (business) persons."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting data from ../../data/crunchbase.json\n",
      "Populating trie for class COMPANY (number: 788714)\n",
      "Populating trie for class ORG (number: 261)\n",
      "Populating trie for class PERSON (number: 1062669)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       ") - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    50\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       " less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "tries = skweak.gazetteers.extract_json_data(\"../../data/crunchbase.json\",  spacy_model=\"en_core_web_sm\")\n",
    "annotator = skweak.gazetteers.GazetteerAnnotator(\"crunchbase_cased\", tries)\n",
    "annotator2 = skweak.gazetteers.GazetteerAnnotator(\"crunchbase_uncased\", tries)\n",
    "\n",
    "annotator2(annotator(doc))\n",
    "skweak.utils.display_entities(doc, [\"crunchbase_cased\", \"crunchbase_uncased\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2.3 Geonames\n",
    "\n",
    "The [geonames](http:www.geonames.org) database [GEO_NAMES](https://github.com/NorskRegnesentral/skweak/blob/main/data/geonames.json) contains a large list of locations, including both geopolitical entities and \"natural\" locations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting data from ../../data/geonames.json\n",
      "Populating trie for class GPE (number: 15205)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       " (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "tries = skweak.gazetteers.extract_json_data(\"../../data/geonames.json\",  spacy_model=\"en_core_web_sm\")\n",
    "annotator = skweak.gazetteers.GazetteerAnnotator(\"geo_cased\", tries)\n",
    "annotator2 = skweak.gazetteers.GazetteerAnnotator(\"geo_uncased\", tries, case_sensitive=False)\n",
    "\n",
    "annotator2(annotator(doc))\n",
    "skweak.utils.display_entities(doc, [\"geo_cased\", \"geo_uncased\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2.4 Product names\n",
    "\n",
    "Finally, I used [DBPedia](http://www.dbpedia.org) to extract a list of products and brands as [Products](https://github.com/NorskRegnesentral/skweak/blob/main/data/products.json), since the recognition of products is particularly poor in Spacy NER models:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting data from ../../data/products.json\n",
      "Populating trie for class PRODUCT (number: 45362)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #bfeeb7; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PRODUCT</span>\n",
       "</mark>\n",
       " is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "\n",
    "tries = skweak.gazetteers.extract_json_data(\"../../data/products.json\",  spacy_model=\"en_core_web_sm\")\n",
    "annotator = skweak.gazetteers.GazetteerAnnotator(\"products_cased\", tries)\n",
    "annotator2 = skweak.gazetteers.GazetteerAnnotator(\"products_uncased\", tries)\n",
    "\n",
    "annotator2(annotator(doc))\n",
    "skweak.utils.display_entities(doc, [\"products_cased\", \"products_uncased\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Shallow patterns\n",
    "\n",
    "Some named entities can also be captured through relatively simple, handcrafted patterns defined on the Spacy document. The class `FunctionAnnotator` makes it easy to define an annotator based on a function that takes a Spacy document as input and generate text spans with a label. Relations of mutual exclusivity between annotation sources can also be specified in the annotator. For instance, we can specify that numbers that are part of a date, time or money span should be ignored from the \"number_detector\" (to avoid having e.g. the `21` in `October 21` labelled as a `CARDINAL`): "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Tuesday\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $50\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $149\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the model with 8 gigabytes of storage, while the 16-gigabyte version is \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $249\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       ". A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $199\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $299\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $197\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $297\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    30\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " days of purchase, are priced at $149 for the model with \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8 gigabytes\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " of storage, while the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3Gs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "-gigabyte iPhone \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">QUANTITY</span>\n",
       "</mark>\n",
       " for $197 and $297 for the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    16\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import examples.ner\n",
    "date_annotator = skweak.heuristics.FunctionAnnotator(\"date_detector\", examples.ner.conll2003_ner.date_generator)\n",
    "time_annotator = skweak.heuristics.FunctionAnnotator(\"time_detector\", examples.ner.conll2003_ner.time_generator)\n",
    "money_annotator = skweak.heuristics.FunctionAnnotator(\"money_detector\", examples.ner.conll2003_ner.money_generator)\n",
    "exclusives = [\"date_detector\", \"time_detector\", \"money_detector\"]\n",
    "number_annotator = skweak.heuristics.FunctionAnnotator(\"number_detector\", examples.ner.conll2003_ner.number_generator)\n",
    "number_annotator.add_incompatible_sources(exclusives)\n",
    "\n",
    "date_annotator(doc)\n",
    "time_annotator(doc)\n",
    "money_annotator(doc)\n",
    "number_annotator(doc)\n",
    "skweak.utils.display_entities(doc, \"date_detector\")\n",
    "skweak.utils.display_entities(doc, \"time_detector\")\n",
    "skweak.utils.display_entities(doc, \"money_detector\")\n",
    "skweak.utils.display_entities(doc, \"number_detector\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I have also created a range of patterns aiming to improve the _detection_ of named entities, even though they leave the actual label underspecified (as a generic `ENT` label). Four such detectors are constructed:\n",
    "- two detectors of proper names based on casing (marking sequence of tokens whose lemma are \"titled\" as potential named entities)\n",
    "- one detector of NNP sequences (based on the Spacy POS tagger)\n",
    "- and one detector of sequences with proper names linked with \"compound\" dependency relations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Retailer Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3G at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    New iPhone 3Gs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ". Buyers of first-generation \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3G for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reporting\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Retailer Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3G at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    New iPhone 3Gs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ". Buyers of first-generation \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3G for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reporting by Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Retailer Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    New iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3Gs currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ". Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", offers refurbished iPhones online. The sale of used iPhones comes as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " Stores \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3G for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       "'s iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    New iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " 3Gs currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ". Buyers of first-generation \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       "-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " stores. Moore said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ENT</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Detection based on casing\n",
    "proper_detector = skweak.heuristics.TokenConstraintAnnotator(\"proper_detector\", skweak.utils.is_likely_proper, \"ENT\")\n",
    "    \n",
    "# Detection based on casing, but allowing some lowercased tokens\n",
    "proper2_detector = skweak.heuristics.TokenConstraintAnnotator(\"proper2_detector\", skweak.utils.is_likely_proper, \"ENT\")\n",
    "proper2_detector.add_gap_tokens(examples.ner.data_utils.LOWERCASED_TOKENS | examples.ner.data_utils.NAME_PREFIXES)\n",
    "#add  .ner.        \n",
    "# Detection based on part-of-speech tags\n",
    "nnp_detector = skweak.heuristics.TokenConstraintAnnotator(\"nnp_detector\", lambda tok: tok.tag_==\"NNP\", \"ENT\")\n",
    "        \n",
    "# Detection based on dependency relations (compound phrases)\n",
    "compound = lambda tok: skweak.utils.is_likely_proper(tok) and skweak.utils.in_compound(tok)\n",
    "compound_detector = skweak.heuristics.TokenConstraintAnnotator(\"compound_detector\", compound, \"ENT\")\n",
    " \n",
    "combined = skweak.base.CombinedAnnotator()\n",
    "exclusives = [\"date_detector\", \"time_detector\", \"money_detector\"]\n",
    "for annotator in [proper_detector, proper2_detector, nnp_detector, compound_detector]:\n",
    "    annotator.add_incompatible_sources(exclusives)\n",
    "    annotator.add_gap_tokens([\"'s\", \"-\"])\n",
    "    combined.add_annotator(annotator)\n",
    "\n",
    "    # We add one variants for each NE detector, looking at infrequent tokens\n",
    "    infrequent_name = \"infrequent_%s\"%annotator.name\n",
    "    combined.add_annotator(skweak.heuristics.SpanConstraintAnnotator(infrequent_name, annotator.name, skweak.utils.is_infrequent))\n",
    "\n",
    "doc = combined(doc)\n",
    "skweak.utils.display_entities(doc, \"proper_detector\")\n",
    "skweak.utils.display_entities(doc, \"proper2_detector\")\n",
    "skweak.utils.display_entities(doc, \"nnp_detector\")\n",
    "skweak.utils.display_entities(doc, \"compound_detector\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, I created three specific annotators:\n",
    "- to recognise company names with a legal type\n",
    "- full person names (with a first name along a list of common first names)\n",
    "- slightly less common entities such as `NORP`, `FAC`, `LANGUAGE`, `EVENT` and `LAW`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Retailer Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       " is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">COMPANY</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ", vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. Wal-Mart sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "\n",
    "# Other types (legal references etc.)      \n",
    "misc_detector = skweak.heuristics.FunctionAnnotator(\"misc_detector\", examples.ner.conll2003_ner.misc_generator)\n",
    "legal_detector = skweak.heuristics.FunctionAnnotator(\"legal_detector\", examples.ner.conll2003_ner.legal_generator)\n",
    "        \n",
    "# Detection of companies with a legal type\n",
    "ends_with_legal_suffix = lambda x: x[-1].lower_.rstrip(\".\") in examples.ner.data_utils.LEGAL_SUFFIXES\n",
    "company_type_detector = skweak.heuristics.SpanConstraintAnnotator(\"company_type_detector\", \"proper2_detector\", \n",
    "                                                    ends_with_legal_suffix, \"COMPANY\")\n",
    "# Detection of full person names\n",
    "FIRSR_NAMES = \"../../data/first_names.json\"\n",
    "full_name_detector = skweak.heuristics.SpanConstraintAnnotator(\"full_name_detector\", \"proper2_detector\", \n",
    "                                                     examples.ner.conll2003_ner.FullNameDetector(), \"PERSON\")\n",
    "\n",
    "\n",
    "legal_detector(doc)\n",
    "company_type_detector(doc)\n",
    "full_name_detector(doc)\n",
    "misc_detector(doc)\n",
    "skweak.utils.display_entities(doc, \"company_type_detector\")\n",
    "skweak.utils.display_entities(doc, \"full_name_detector\")\n",
    "skweak.utils.display_entities(doc, \"misc_detector\")\n",
    "skweak.utils.display_entities(doc, \"legal_detector\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we also rely on an external probabilistic [parser of named entities](https://github.com/snipsco/snips-nlu-parsers) from [Snips](https://snips.ai/). The parser recognises `DATE`, `TIME`, `ORDINAL`, `CARDINAL`, `MONEY` and `PERCENT`. The parser is implemented in _Rust_, so it runs quite fast."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    on Tuesday\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " it is selling refurbished versions of Apple Inc's iPhone \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " at its stores that are priced \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    about $50\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " less than new iPhones. The electronics chain said the used iPhones, which were returned \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    within 30 days\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">TIME</span>\n",
       "</mark>\n",
       " of purchase, are priced at \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $149\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " for the model with \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " gigabytes of storage, while \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    the 16\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       "-gigabyte version is \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $249\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       ". A \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    two\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $199\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $299\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " at Best Buy Mobile stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " models at Best Buy, he said. Moore said AT&amp;T, the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as Best Buy, the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    last month\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       ". Wal-Mart sells a new \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    8\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       "-gigabyte iPhone \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">CARDINAL</span>\n",
       "</mark>\n",
       " for \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $197\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " and \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    $297\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MONEY</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    for the 16\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">TIME</span>\n",
       "</mark>\n",
       "-gigabyte model. The iPhone is also sold at Apple stores and AT&amp;T stores. Moore said Best Buy's move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Detection based on a probabilistic parser\n",
    "# NB: requires to install \"snips-nlu-parsers\" (pip install snips-nlu-parsers)\n",
    "snips = examples.ner.conll2003_ner.SnipsAnnotator(\"snips\")\n",
    "snips(doc)\n",
    "skweak.utils.display_entities(doc, \"snips\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. Document-level annotators\n",
    "\n",
    "All annotators presented so far rely on _local_ decisions on tokens or phrases.  However, news articles are not mere collections of words, but exhibit a high degree of internal coherence. This can be exploited to furhter improve the annotation. Two document-level annotators are implemented:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before we can run the document-level annotators, we need to normalise some of the entities. The `ConLL2003Standardiser` is responsible for this normalisation:\n",
    "- entities `PER` (from conll2003, BTC and SEC) are set to `PERSON`\n",
    "- entities `LOC` from conll2003, BTC and SEC for spans that are also annotated by other layers as `GPE` are set to `GPE` \n",
    "- entities `ORG` that are annotated by other layers as `COMPANY` are set to `COMPANY`\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "annotator = examples.ner.conll2003_ner.ConLL2003Standardiser()\n",
    "doc = annotator(doc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><style>\n",
       ".tooltip {  position: relative;  border-bottom: 1px dotted black; }\n",
       ".tooltip .tooltip-text {visibility: hidden;  background-color: black;  color: white;\n",
       "                        line-height: 1.2;  text-align: right;  border-radius: 6px;\n",
       "                        padding: 5px 0; position: absolute; z-index: 1; margin-left:1em;\n",
       "                        opacity: 0; transition: opacity 1s;}\n",
       ".tooltip .tooltip-text::after {position: absolute; top: 1.5em; right: 100%; margin-top: -5px;\n",
       "                               border-width: 5px; border-style: solid; \n",
       "                               border-color: transparent black transparent transparent;}\n",
       ".tooltip:hover .tooltip-text {visibility: visible; opacity: 1;}\n",
       "</style>\n",
       "<div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>ATLANTA<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>wikismall_uncased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Reuters<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Retailer<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Best<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Co<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said <label class='tooltip'>on<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>Tuesday<span class='tooltip-text' style='width:210px'>date_detector:\tDATE&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Apple<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>'s<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label> <label class='tooltip'>3<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " at its stores that are priced <label class='tooltip'>about<span class='tooltip-text' style='width:203px'>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label>\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>50<span class='tooltip-text' style='width:238px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", which were returned <label class='tooltip'>within<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>30<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>days<span class='tooltip-text' style='width:196px'>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> of purchase, are priced at <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>149<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> for the model with <label class='tooltip'>8<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> <label class='tooltip'>gigabytes<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> of storage, while <label class='tooltip'>the<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>16<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> version is <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>249<span class='tooltip-text' 
style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label>. A <label class='tooltip'>two<span class='tooltip-text' style='width:196px'>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:196px'>core_web_md:\tDATE&nbsp;&nbsp</span></label><label class='tooltip'>year<span class='tooltip-text' style='width:196px'>core_web_md:\tDATE&nbsp;&nbsp</span></label> service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>New<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp</span></label> <label class='tooltip'>iPhone<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " <label class='tooltip'>3Gs<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp</span></label> currently sell for <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>199<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> and <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>299<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> at \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Mobile<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Scott<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>mv:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>wiki:\tPER&nbsp;&nbsp<br>wikismall_cased:\tPER&nbsp;&nbsp<br>wikismall_uncased:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Moore<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>mv:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>wiki:\tPER&nbsp;&nbsp<br>wikismall_cased:\tPER&nbsp;&nbsp<br>wikismall_uncased:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Mobile<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of <label class='tooltip'>first<span class='tooltip-text' style='width:217px'>core_web_md:\tORDINAL&nbsp;&nbsp</span></label>-generation \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished <label class='tooltip'>3<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp</span></label> models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Moore<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wikismall_cased:\tLOC&nbsp;&nbsp<br>wikismall_uncased:\tLOC&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tMISC&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Wal<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>Mart<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Stores<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone <label class='tooltip'>late<span class='tooltip-text' style='width:196px'>core_web_md:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>last<span class='tooltip-text' style='width:196px'>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>month<span class='tooltip-text' style='width:196px'>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label>. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Wal<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>Mart<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tLOC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " sells a new <label class='tooltip'>8<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label> <label class='tooltip'>3<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " for <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>197<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> and <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>297<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> <label class='tooltip'>for<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>the<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>16<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tMISC&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Apple<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Moore<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wikismall_cased:\tLOC&nbsp;&nbsp<br>wikismall_uncased:\tLOC&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (<label class='tooltip'>Reporting<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp</span></label> <label class='tooltip'>by<span class='tooltip-text' style='width:224px'>proper2_detector:\tENT&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Karen<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Jacobs<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>mv:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Andre<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>mv:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Grenon<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>mv:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "mv = skweak.aggregation.MajorityVoter(\"mv\", [\"LOC\", \"MISC\", \"ORG\", \"PER\"])\n",
    "mv.add_underspecified_label(\"ENT\", {\"LOC\", \"MISC\", \"ORG\", \"PER\"})\n",
    "doc = mv(doc)\n",
    "skweak.utils.display_entities(doc, \"mv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4.1 Document history\n",
    "\n",
    "When a journalist first mentions an entity such as a company or person in an article, they typically write it in a \"long form\", and then use shorter mentions once the entity is properly introduced. For instance, in the text above, \"Scott Moore\" is first mentioned with a full name, and then simply referred to as \"Moore\". Similarly, companies are often first introduced to with their legal type.  The `DocumentHistoryAnnotator` takes advantage of this property, by propagating the label from the first mention onto subsequent mentions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer Best Buy Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of Apple Inc's iPhone 3G at its stores that are priced about $50 less than new iPhones. The electronics chain said the used iPhones, which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with AT&amp;T Inc is required. New iPhone 3Gs currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for Best Buy Mobile. Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the iPhone, offers refurbished iPhones online. The sale of used iPhones comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as Wal-Mart Stores Inc, which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte iPhone 3G for $197 and $297 for the 16-gigabyte model. The iPhone is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "annotator = skweak.doclevel.DocumentHistoryAnnotator(\"doc_history\", \"mv\", [\"PER\", \"ORG\"])\n",
    "annotator(doc)\n",
    "skweak.utils.display_entities(doc, \"doc_history\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4.2 Label consistency\n",
    "\n",
    "Another property of news documents is the fact that two (or more) named entities sharing the same string in a text typically refer to the same entity, and should therefore have the same label. \"Komatsu\" can be both a company name and a city in Japan, but within a given document, it will typically be one or the other for the whole document. We can capture this fact with an annotator that looks at the majority label for a given string, and annotate all occurrences with this label:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> ATLANTA (Reuters) - Retailer \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " Co, seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " Inc's \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " Inc is required. New iPhone 3Gs currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said Scott Moore, vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of first-generation iPhones can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " Stores Inc, which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone 3G\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (Reporting by Karen Jacobs ; Editing by Andre Grenon )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
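   ],
   "source": [
    "# Document-level majority: each mention gets the label that the \"mv\" layer\n",
    "# most often assigns to the same string elsewhere in this document.\n",
    "annotator = skweak.doclevel.DocumentMajorityAnnotator(\"doc_majority\", \"mv\")\n",
    "annotator(doc)\n",
    "# Show the spans stored in the new \"doc_majority\" layer.\n",
    "skweak.utils.display_entities(doc, \"doc_majority\")"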
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><style>\n",
       ".tooltip {  position: relative;  border-bottom: 1px dotted black; }\n",
       ".tooltip .tooltip-text {visibility: hidden;  background-color: black;  color: white;\n",
       "                        line-height: 1.2;  text-align: right;  border-radius: 6px;\n",
       "                        padding: 5px 0; position: absolute; z-index: 1; margin-left:1em;\n",
       "                        opacity: 0; transition: opacity 1s;}\n",
       ".tooltip .tooltip-text::after {position: absolute; top: 1.5em; right: 100%; margin-top: -5px;\n",
       "                               border-width: 5px; border-style: solid; \n",
       "                               border-color: transparent black transparent transparent;}\n",
       ".tooltip:hover .tooltip-text {visibility: visible; opacity: 1;}\n",
       "</style>\n",
       "<div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>ATLANTA<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>wikismall_uncased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Reuters<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Retailer<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Best<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Co<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said <label class='tooltip'>on<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>Tuesday<span class='tooltip-text' style='width:210px'>date_detector:\tDATE&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Apple<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>'s<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label> <label class='tooltip'>3<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " at its stores that are priced <label class='tooltip'>about<span class='tooltip-text' style='width:203px'>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label>\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>50<span class='tooltip-text' style='width:238px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", which were returned <label class='tooltip'>within<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>30<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>days<span class='tooltip-text' style='width:196px'>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> of purchase, are priced at <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>149<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> for the model with <label class='tooltip'>8<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> <label class='tooltip'>gigabytes<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> of storage, while <label class='tooltip'>the<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>16<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> version is <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>249<span class='tooltip-text' 
style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label>. A <label class='tooltip'>two<span class='tooltip-text' style='width:196px'>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:196px'>core_web_md:\tDATE&nbsp;&nbsp</span></label><label class='tooltip'>year<span class='tooltip-text' style='width:196px'>core_web_md:\tDATE&nbsp;&nbsp</span></label> service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>New<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp</span></label> <label class='tooltip'>iPhone<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " <label class='tooltip'>3Gs<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp</span></label> currently sell for <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>199<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> and <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>299<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Mobile<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Scott<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>wiki:\tPER&nbsp;&nbsp<br>wikismall_cased:\tPER&nbsp;&nbsp<br>wikismall_uncased:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Moore<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>wiki:\tPER&nbsp;&nbsp<br>wikismall_cased:\tPER&nbsp;&nbsp<br>wikismall_uncased:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Mobile<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of <label class='tooltip'>first<span class='tooltip-text' style='width:217px'>core_web_md:\tORDINAL&nbsp;&nbsp</span></label>-generation \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished <label class='tooltip'>3<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp</span></label> models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Moore<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wikismall_cased:\tLOC&nbsp;&nbsp<br>wikismall_uncased:\tLOC&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tPER&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tMISC&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Wal<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tLOC&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tLOC&nbsp;&nbsp</span></label><label class='tooltip'>Mart<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_majority:\tLOC&nbsp;&nbsp</span></label> <label class='tooltip'>Stores<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       ", which began selling the popular phone <label class='tooltip'>late<span class='tooltip-text' style='width:196px'>core_web_md:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>last<span class='tooltip-text' style='width:196px'>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>month<span class='tooltip-text' style='width:196px'>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp</span></label>. \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Wal<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tLOC&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tLOC&nbsp;&nbsp</span></label><label class='tooltip'>Mart<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " sells a new <label class='tooltip'>8<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label> <label class='tooltip'>3<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " for <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>197<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> and <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>297<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp</span></label> <label class='tooltip'>for<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>the<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>16<span class='tooltip-text' style='width:252px'>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:224px'>core_web_md:\tQUANTITY&nbsp;&nbsp</span></label> model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tMISC&nbsp;&nbsp<br>wikismall_cased:\tMISC&nbsp;&nbsp<br>wikismall_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Apple<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tLOC&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Moore<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>wiki:\tLOC&nbsp;&nbsp<br>wikismall_cased:\tLOC&nbsp;&nbsp<br>wikismall_uncased:\tLOC&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tPER&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tORG&nbsp;&nbsp<br>wiki:\tORG&nbsp;&nbsp<br>wikismall_cased:\tORG&nbsp;&nbsp<br>wikismall_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history:\tORG&nbsp;&nbsp<br>doc_majority:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "'s move was not in response to other retailers' actions. (<label class='tooltip'>Reporting<span class='tooltip-text' style='width:224px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp</span></label> <label class='tooltip'>by<span class='tooltip-text' style='width:224px'>proper2_detector:\tENT&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Karen<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Jacobs<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Andre<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Grenon<span class='tooltip-text' style='width:238px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>conll2003:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "mv = skweak.aggregation.MajorityVoter(\"mv\", [\"LOC\", \"MISC\", \"ORG\", \"PER\"])\n",
    "mv.add_underspecified_label(\"ENT\", {\"LOC\", \"MISC\", \"ORG\", \"PER\"})\n",
    "doc = mv(doc)\n",
    "skweak.utils.display_entities(doc, \"mv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## __Step 2__: Estimation of label model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can construct a full annotator with all annotators described above, and then run it on a dataset such as Reuters, Bloomberg, or Acquire:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading shallow functions\n",
      "Loading Spacy NER models\n",
      "Loading gazetteer supervision modules\n",
      "Extracting data from ../../domains/ner/../../data/wikidata_tokenised.json\n",
      "Populating trie for class PERSON (number: 2621131)\n",
      "Populating trie for class LOC (number: 47104)\n",
      "Populating trie for class GPE (number: 601419)\n",
      "Populating trie for class ORG (number: 295449)\n",
      "Populating trie for class PRODUCT (number: 12457)\n",
      "Extracting data from ../../domains/ner/../../data/wikidata_small_tokenised.json\n",
      "Populating trie for class PERSON (number: 1863434)\n",
      "Populating trie for class LOC (number: 14241)\n",
      "Populating trie for class GPE (number: 273373)\n",
      "Populating trie for class ORG (number: 91341)\n",
      "Populating trie for class PRODUCT (number: 12457)\n",
      "Extracting data from ../../domains/ner/../../data/geonames.json\n",
      "Populating trie for class GPE (number: 15205)\n",
      "Extracting data from ../../domains/ner/../../data/crunchbase.json\n",
      "Populating trie for class COMPANY (number: 788714)\n",
      "Populating trie for class ORG (number: 261)\n",
      "Populating trie for class PERSON (number: 1062669)\n",
      "Extracting data from ../../domains/ner/../../data/products.json\n",
      "Populating trie for class PRODUCT (number: 45362)\n",
      "Loading document-level supervision sources\n",
      "Total number of annotators: 52\n"
     ]
    }
   ],
   "source": [
    "full_annotator = examples.ner.conll2003_ner.NERAnnotator().add_all()\n",
    "print(\"Total number of annotators:\", len(full_annotator.annotators))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can then take the raw data from [Reuters](https://github.com/NorskRegnesentral/skweak/raw/main/data/reuters_small.tar.gz), run Spacy on the textual content, and finally apply the annotator to get annotations from the each source:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "# We annotate 200 documents, and store them in a Spacy DocBin file\n",
    "docs = list(skweak.utils.docbin_reader(\"../../data/reuters_small.spacy\"))\n",
    "docs list(full_annotator.pipe(docs))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once this is done, we can finally estimate a unified annotator model through weak supervision. The basic idea is to describe the named entity recognition problem as a _Hidden markov Model_ where the observations are the annotations from each source, and the states correspond to the \"true\" (hidden) labels for each token, as illustrated in Figure 2 in the paper."
   ]
  },
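  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of what this model looks like, write $Y_i$ for the hidden label of token $i$ in a document of $n$ tokens, and $\\lambda_{i,j}$ for the label emitted by source $j$ (out of $J$ sources) on that token. The symbols $n$ and $J$ are introduced here for illustration only. Assuming that the sources are conditionally independent given the hidden label, the joint probability of the hidden labels and the observed annotations factorises as:\n",
    "\n",
    "$$P(Y_0, \\ldots, Y_{n-1}, \\lambda) = P(Y_0) \\prod_{i=1}^{n-1} P(Y_i | Y_{i-1}) \\prod_{i=0}^{n-1} \\prod_{j=1}^{J} P(\\lambda_{i,j} | Y_i)$$\n",
    "\n",
    "The first factor is the initial distribution, the second product runs over the label transitions, and the last one multiplies together the emissions of every source at every token."
   ]
  },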
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we don't have access to the true labels for each token, we will rely on _Baum-Welch_ (a variant of EM) to estimate the HMM model through unsupervised training. More specifically, we will need to estimate 3 models:\n",
    "- the initial probabilities $P(Y_0)$ of the labels for the first token of a document\n",
    "- the transition matrix $P(Y_i | Y_{i-1})$ for the labels \n",
    "- the emission models $P(\\lambda_{i,j} | Y_i)$ of observing a particular value $\\lambda_{i,j}$ (say, `B-PER`) from the source $j$ given the true label $Y_i$. In the current model, we assume the emissions to be independent of one another given the true label, to reduce the complexity of the model.\n",
    "\n",
    "Given an annotated dataset, the HMM model can be easily estimated:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Starting iteration 1\n",
      "Finished E-step with 195 documents\n",
      "Starting iteration 2\n",
      "         1     -358446.4555             +nan\n",
      "Finished E-step with 195 documents\n",
      "Starting iteration 3\n",
      "         2     -347323.9735      +11122.4821\n",
      "Finished E-step with 195 documents\n",
      "Starting iteration 4\n",
      "         3     -346892.7074        +431.2661\n",
      "Finished E-step with 195 documents\n",
      "         4     -346777.5166        +115.1908\n"
     ]
    }
   ],
   "source": [
    "unified_model = skweak.aggregation.HMM(\"hmm\", [\"LOC\", \"MISC\", \"ORG\", \"PER\"])\n",
    "unified_model.add_underspecified_label(\"ENT\", [\"LOC\", \"MISC\", \"ORG\", \"PER\"])\n",
    "# We then run Baum-Welch on the model (can take some time)\n",
    "unified_model.fit(docs)\n",
    "\n",
    "# Saving the model to a file\n",
    "unified_model.save(\"../../data/hmm_reuters_small.pkl\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the HMM model relies on some informative priors to facilitate the parameter estimation:\n",
    "- the prior for the initial probabilities is a Dirichlet based on counts for the most reliable model (chosen right now to be the Spacy NER model trained on Ontonotes)\n",
    "- the prior for the transition matrix is a list of Dirichlet also based on counts from the standard Spacy NER model.\n",
    "- finally, the initial emission models are calculated based on subjective estimates of the relative precision and recall of each source. For instance, we know that a source like `company_type_detector` (which looks at legal suffixes such as \"Inc.\" at the end of the noun phrase) has a very high precision, but a low recall , since many mentions of companies do not include a suffix. In contrast, gazeteers will tend to have a better recall, but a lower precision (some company names also happen to be names of geopolitical entities or persons).  The initial precisions and recalls provided to the model are specified in `SOURCE_PRIORS` in the file `labelling.py`. When a precision and recall is not provided for a given source, they are assumed to be zeros (for instance, `company_type_detector` only detects `COMPANY` entities and nothing else).  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "#unified_model.pretty_print() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the model is learned, we can apply it as any other \"annotator\" object:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><style>\n",
       ".tooltip {  position: relative;  border-bottom: 1px dotted black; }\n",
       ".tooltip .tooltip-text {visibility: hidden;  background-color: black;  color: white;\n",
       "                        line-height: 1.2;  text-align: right;  border-radius: 6px;\n",
       "                        padding: 5px 0; position: absolute; z-index: 1; margin-left:1em;\n",
       "                        opacity: 0; transition: opacity 1s;}\n",
       ".tooltip .tooltip-text::after {position: absolute; top: 1.5em; right: 100%; margin-top: -5px;\n",
       "                               border-width: 5px; border-style: solid; \n",
       "                               border-color: transparent black transparent transparent;}\n",
       ".tooltip:hover .tooltip-text {visibility: visible; opacity: 1;}\n",
       "</style>\n",
       "<div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"><label class='tooltip'>Best<span class='tooltip-text' style='width:231px'>doc_history_cased:\tORG&nbsp;&nbsp</span></label> buy offers used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:259px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority_cased:\tMISC&nbsp;&nbsp<br>doc_majority_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " at lower price</br>\n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>ATLANTA<span class='tooltip-text' style='width:301px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp<br>wiki_uncased:\tLOC&nbsp;&nbsp<br>wiki_small_uncased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Reuters<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Retailer<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Best<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:259px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Co<span class='tooltip-text' style='width:301px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said <label class='tooltip'>on<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>Tuesday<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Apple<span class='tooltip-text' style='width:308px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:308px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>'s<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:301px'>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>3<span class='tooltip-text' style='width:301px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:301px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " at its stores that are priced <label class='tooltip'>about<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> <label class='tooltip'>$<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>50<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority_cased:\tMISC&nbsp;&nbsp<br>doc_majority_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority_cased:\tMISC&nbsp;&nbsp<br>doc_majority_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", which were returned <label class='tooltip'>within<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>30<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>days<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> of purchase, are priced at <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>149<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> for the model with <label class='tooltip'>8<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label> <label class='tooltip'>gigabytes<span class='tooltip-text' 
style='width:336px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label> of storage, while <label class='tooltip'>the<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>16<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label> version is <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>249<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label>. 
A <label class='tooltip'>two<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label><label class='tooltip'>year<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. <label class='tooltip'>New<span class='tooltip-text' style='width:196px'>nnp_detector:\tENT&nbsp;&nbsp</span></label> <label class='tooltip'>iPhone<span class='tooltip-text' style='width:231px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp</span></label> <label class='tooltip'>3Gs<span class='tooltip-text' style='width:252px'>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp</span></label> currently sell for <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>199<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> and <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>299<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:252px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:252px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Mobile<span class='tooltip-text' style='width:252px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Scott<span class='tooltip-text' style='width:315px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp<br>wiki_cased:\tPER&nbsp;&nbsp<br>wiki_uncased:\tPER&nbsp;&nbsp<br>multitoken_wiki_cased:\tPER&nbsp;&nbsp<br>multitoken_wiki_uncased:\tPER&nbsp;&nbsp<br>wiki_small_cased:\tPER&nbsp;&nbsp<br>wiki_small_uncased:\tPER&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tPER&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tPER&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Moore<span class='tooltip-text' 
style='width:315px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp<br>wiki_cased:\tPER&nbsp;&nbsp<br>wiki_uncased:\tPER&nbsp;&nbsp<br>multitoken_wiki_cased:\tPER&nbsp;&nbsp<br>multitoken_wiki_uncased:\tPER&nbsp;&nbsp<br>wiki_small_cased:\tPER&nbsp;&nbsp<br>wiki_small_uncased:\tPER&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tPER&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tPER&nbsp;&nbsp<br>crunchbase_cased:\tPER&nbsp;&nbsp<br>crunchbase_uncased:\tPER&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tPER&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", <label class='tooltip'>vice<span class='tooltip-text' style='width:231px'>products_uncased:\tMISC&nbsp;&nbsp</span></label> president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:252px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:252px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Mobile<span class='tooltip-text' style='width:252px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of <label class='tooltip'>first<span class='tooltip-text' style='width:329px'>spacy:\tORDINAL&nbsp;&nbsp<br>core_web_md:\tORDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tORDINAL&nbsp;&nbsp<br>edited_core_web_md:\tORDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORDINAL&nbsp;&nbsp</span></label>-generation \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:231px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished <label class='tooltip'>3<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tCARDINAL&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:252px'>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp</span></label> models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tMISC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' 
style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tMISC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said\n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>.<span class='tooltip-text' style='width:245px'>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Moore<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp<br>doc_history_cased:\tPER&nbsp;&nbsp<br>doc_majority_cased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tLOC&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tLOC&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:259px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority_cased:\tMISC&nbsp;&nbsp<br>doc_majority_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhones<span class='tooltip-text' style='width:308px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tMISC&nbsp;&nbsp<br>core_web_md_truecase:\tMISC&nbsp;&nbsp<br>edited_core_web_md:\tMISC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMISC&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_majority_cased:\tMISC&nbsp;&nbsp<br>doc_majority_uncased:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' 
style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to <label class='tooltip'>fend<span class='tooltip-text' style='width:238px'>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Wal<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>Mart<span class='tooltip-text' 
style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Stores<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Inc<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone <label class='tooltip'>late<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>last<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>month<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label>. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Wal<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>Mart<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new <label class='tooltip'>8<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>3<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>G<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>number_detector:\tQUANTITY&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " for <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>197<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> and <label class='tooltip'>$<span class='tooltip-text' style='width:224px'>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp</span></label><label class='tooltip'>297<span class='tooltip-text' style='width:315px'>spacy:\tMONEY&nbsp;&nbsp<br>money_detector:\tMONEY&nbsp;&nbsp<br>snips:\tMONEY&nbsp;&nbsp<br>core_web_md:\tMONEY&nbsp;&nbsp<br>core_web_md_truecase:\tMONEY&nbsp;&nbsp<br>edited_core_web_md:\tMONEY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMONEY&nbsp;&nbsp</span></label> <label class='tooltip'>for<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>the<span class='tooltip-text' style='width:154px'>snips:\tTIME&nbsp;&nbsp</span></label> <label class='tooltip'>16<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tTIME&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>-<span class='tooltip-text' style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label><label class='tooltip'>gigabyte<span class='tooltip-text' 
style='width:336px'>spacy:\tQUANTITY&nbsp;&nbsp<br>core_web_md:\tQUANTITY&nbsp;&nbsp<br>core_web_md_truecase:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md:\tQUANTITY&nbsp;&nbsp<br>edited_core_web_md_truecase:\tQUANTITY&nbsp;&nbsp</span></label> model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>iPhone<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tMISC&nbsp;&nbsp<br>wiki_uncased:\tMISC&nbsp;&nbsp<br>wiki_small_cased:\tMISC&nbsp;&nbsp<br>wiki_small_uncased:\tMISC&nbsp;&nbsp<br>products_cased:\tMISC&nbsp;&nbsp<br>products_uncased:\tMISC&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Apple<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>AT&amp;T<span class='tooltip-text' style='width:308px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tLOC&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tLOC&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Moore<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp<br>doc_history_cased:\tPER&nbsp;&nbsp<br>doc_majority_cased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Best<span class='tooltip-text' style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Buy<span class='tooltip-text' style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_cased:\tORG&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "<label class='tooltip'>'s<span class='tooltip-text' style='width:147px'>spacy:\tPER&nbsp;&nbsp</span></label> move was not in response to other retailers' actions. </div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><style>\n",
       ".tooltip {  position: relative;  border-bottom: 1px dotted black; }\n",
       ".tooltip .tooltip-text {visibility: hidden;  background-color: black;  color: white;\n",
       "                        line-height: 1.2;  text-align: right;  border-radius: 6px;\n",
       "                        padding: 5px 0; position: absolute; z-index: 1; margin-left:1em;\n",
       "                        opacity: 0; transition: opacity 1s;}\n",
       ".tooltip .tooltip-text::after {position: absolute; top: 1.5em; right: 100%; margin-top: -5px;\n",
       "                               border-width: 5px; border-style: solid; \n",
       "                               border-color: transparent black transparent transparent;}\n",
       ".tooltip:hover .tooltip-text {visibility: visible; opacity: 1;}\n",
       "</style>\n",
       "<div class=\"entities\" style=\"line-height: 2.5; direction: ltr\">\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Goody<span class='tooltip-text' style='width:273px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "<label class='tooltip'>'s<span class='tooltip-text' style='width:238px'>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> family clothing to liquidate stores</br>\n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>NEW<span class='tooltip-text' style='width:315px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>edited_core_web_md:\tLOC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp<br>wiki_uncased:\tLOC&nbsp;&nbsp<br>multitoken_wiki_uncased:\tLOC&nbsp;&nbsp<br>wiki_small_uncased:\tLOC&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp<br>multitoken_geo_uncased:\tLOC&nbsp;&nbsp</span></label> <label class='tooltip'>YORK<span class='tooltip-text' style='width:315px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>edited_core_web_md:\tLOC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp<br>wiki_uncased:\tLOC&nbsp;&nbsp<br>multitoken_wiki_uncased:\tLOC&nbsp;&nbsp<br>wiki_small_uncased:\tLOC&nbsp;&nbsp<br>multitoken_wiki_small_uncased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp<br>multitoken_geo_uncased:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Reuters<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>wiki_cased:\tORG&nbsp;&nbsp<br>wiki_uncased:\tORG&nbsp;&nbsp<br>wiki_small_cased:\tORG&nbsp;&nbsp<br>wiki_small_uncased:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Goody<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>'s<span class='tooltip-text' style='width:301px'>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Family<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp</span></label> <label class='tooltip'>Clothing<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", a privately held apparel retail chain which emerged from bankruptcy <label class='tooltip'>in<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>October<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", plans to liquidate its remaining stores as the \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>U.S.<span class='tooltip-text' style='width:301px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>BTC:\tLOC&nbsp;&nbsp<br>BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_BTC:\tLOC&nbsp;&nbsp<br>edited_BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_core_web_md:\tLOC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " economic recession has undermined its ability to continue operating. &quot;The company is in the processes of obtaining bids to liquidate substantially all collateral and inventory,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Cathy<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Hershcopf<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", a bankruptcy partner at law firm \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Cooley<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Godward<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Kronish<span class='tooltip-text' 
style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>LLP<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ". &quot;The retail environment is very difficult and they did not have sufficient capital to weather the bad times.&quot; The law firm is acting as a liaison between \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Goody<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>'s<span class='tooltip-text' style='width:301px'>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " vendors and the company and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Prentice<span class='tooltip-text' style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Capital<span class='tooltip-text' style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Management<span class='tooltip-text' 
style='width:315px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_cased:\tORG&nbsp;&nbsp<br>multitoken_crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>PGDYS<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>Lending<span class='tooltip-text' style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>LLC<span class='tooltip-text' 
style='width:308px'>company_detector:\tORG&nbsp;&nbsp<br>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>company_type_detector:\tORG&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tORG&nbsp;&nbsp<br>BTC_truecase:\tORG&nbsp;&nbsp<br>edited_BTC:\tORG&nbsp;&nbsp<br>edited_BTC_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Goody<span class='tooltip-text' style='width:301px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label><label class='tooltip'>'s<span class='tooltip-text' style='width:252px'>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " parent company, and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Prentice<span class='tooltip-text' style='width:301px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is the manager of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>PGDYS<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tORG&nbsp;&nbsp<br>core_web_md_truecase:\tORG&nbsp;&nbsp<br>edited_core_web_md:\tORG&nbsp;&nbsp<br>edited_core_web_md_truecase:\tORG&nbsp;&nbsp<br>doc_history_cased:\tORG&nbsp;&nbsp<br>doc_history_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Bob<span class='tooltip-text' style='width:301px'>spacy:\tPER&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Carbonell<span class='tooltip-text' style='width:301px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", chief credit officer for retail credit rating service \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Bernard<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>Sands<span class='tooltip-text' style='width:308px'>spacy:\tPER&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>compound_detector:\tENT&nbsp;&nbsp<br>infrequent_compound_detector:\tENT&nbsp;&nbsp<br>full_name_detector:\tPER&nbsp;&nbsp<br>core_web_md:\tPER&nbsp;&nbsp<br>core_web_md_truecase:\tPER&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tPER&nbsp;&nbsp<br>edited_core_web_md_truecase:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", confirmed that the company plans to liquidate. Going-out-of-business sales will <label class='tooltip'>begin<span class='tooltip-text' style='width:196px'>wiki_uncased:\tPER&nbsp;&nbsp</span></label> <label class='tooltip'>as<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>early<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> <label class='tooltip'>as<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Friday<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       "<label class='tooltip'>,<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Hershcopf<span class='tooltip-text' style='width:301px'>spacy:\tORG&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>infrequent_proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>infrequent_proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>infrequent_nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>BTC:\tPER&nbsp;&nbsp<br>BTC_truecase:\tPER&nbsp;&nbsp<br>edited_BTC:\tPER&nbsp;&nbsp<br>edited_BTC_truecase:\tPER&nbsp;&nbsp<br>edited_core_web_md:\tLOC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp<br>doc_history_cased:\tPER&nbsp;&nbsp<br>doc_history_uncased:\tPER&nbsp;&nbsp<br>doc_majority_cased:\tPER&nbsp;&nbsp<br>doc_majority_uncased:\tPER&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ". When the company emerged from bankruptcy, it operated <label class='tooltip'>287<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tCARDINAL&nbsp;&nbsp</span></label> stores in <label class='tooltip'>20<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tCARDINAL&nbsp;&nbsp</span></label> states. The \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Knoxville<span class='tooltip-text' style='width:301px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>BTC:\tLOC&nbsp;&nbsp<br>BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_BTC:\tLOC&nbsp;&nbsp<br>edited_BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_core_web_md:\tLOC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp<br>geo_cased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       ", \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Tennessee<span class='tooltip-text' style='width:301px'>spacy:\tLOC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>core_web_md:\tLOC&nbsp;&nbsp<br>core_web_md_truecase:\tLOC&nbsp;&nbsp<br>BTC:\tLOC&nbsp;&nbsp<br>BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_BTC:\tLOC&nbsp;&nbsp<br>edited_BTC_truecase:\tLOC&nbsp;&nbsp<br>edited_core_web_md:\tLOC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tLOC&nbsp;&nbsp<br>wiki_cased:\tLOC&nbsp;&nbsp<br>wiki_uncased:\tLOC&nbsp;&nbsp<br>wiki_small_cased:\tLOC&nbsp;&nbsp<br>wiki_small_uncased:\tLOC&nbsp;&nbsp<br>geo_cased:\tLOC&nbsp;&nbsp<br>geo_uncased:\tLOC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       "-based <label class='tooltip'>retailer<span class='tooltip-text' style='width:238px'>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> filed for \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>Chapter<span class='tooltip-text' style='width:308px'>gazetteer:\tORG&nbsp;&nbsp<br>spacy:\tMISC&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>legal_detector:\tMISC&nbsp;&nbsp<br>core_web_md:\tMISC&nbsp;&nbsp<br>core_web_md_truecase:\tMISC&nbsp;&nbsp<br>edited_core_web_md:\tMISC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMISC&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label> <label class='tooltip'>11<span class='tooltip-text' style='width:308px'>spacy:\tMISC&nbsp;&nbsp<br>legal_detector:\tMISC&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tMISC&nbsp;&nbsp<br>core_web_md_truecase:\tMISC&nbsp;&nbsp<br>edited_core_web_md:\tMISC&nbsp;&nbsp<br>edited_core_web_md_truecase:\tMISC&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " bankruptcy protection <label class='tooltip'>on<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>June<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " <label class='tooltip'>9<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp</span></label>, hurt by high gasoline and food prices that have forced consumers to cut back on nonessential purchases. It emerged from bankruptcy protection <label class='tooltip'>in<span class='tooltip-text' style='width:154px'>snips:\tDATE&nbsp;&nbsp</span></label> \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    <label class='tooltip'>October<span class='tooltip-text' style='width:308px'>spacy:\tDATE&nbsp;&nbsp<br>proper_detector:\tENT&nbsp;&nbsp<br>proper2_detector:\tENT&nbsp;&nbsp<br>nnp_detector:\tENT&nbsp;&nbsp<br>snips:\tDATE&nbsp;&nbsp<br>core_web_md:\tDATE&nbsp;&nbsp<br>core_web_md_truecase:\tDATE&nbsp;&nbsp<br>edited_core_web_md:\tDATE&nbsp;&nbsp<br>edited_core_web_md_truecase:\tDATE&nbsp;&nbsp<br>crunchbase_cased:\tORG&nbsp;&nbsp<br>crunchbase_uncased:\tORG&nbsp;&nbsp<br>doc_majority_cased:\tORG&nbsp;&nbsp<br>doc_majority_uncased:\tORG&nbsp;&nbsp</span></label>\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " after cutting operating costs and closing <label class='tooltip'>at<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tCARDINAL&nbsp;&nbsp</span></label> <label class='tooltip'>least<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tCARDINAL&nbsp;&nbsp</span></label> <label class='tooltip'>69<span class='tooltip-text' style='width:336px'>spacy:\tCARDINAL&nbsp;&nbsp<br>number_detector:\tCARDINAL&nbsp;&nbsp<br>snips:\tCARDINAL&nbsp;&nbsp<br>core_web_md:\tCARDINAL&nbsp;&nbsp<br>core_web_md_truecase:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md:\tCARDINAL&nbsp;&nbsp<br>edited_core_web_md_truecase:\tCARDINAL&nbsp;&nbsp</span></label> underperforming stores. </div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "docs = list(unified_model.pipe(docs))\n",
    "skweak.utils.display_entities(docs[0], \"hmm\")\n",
    "skweak.utils.display_entities(docs[1], \"hmm\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## __Step 3__: Development of neural NER model\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now learn a neural NER model based on these unified annotations. We have two options: a straighforward (but slightly underperforming) approach using Spacy, and a more sophisticated approach using our own NER model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### __Alternative 1__: Using Spacy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Write to ../../data/reuters_small.spacy...done\n"
     ]
    }
   ],
   "source": [
    "for doc in docs:\n",
    "    doc.ents = doc.spans[\"hmm\"]\n",
    "skweak.utils.docbin_writer(docs, \"../../data/reuters_small.spacy\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can then directly train a new NER model with Spacy's training regime:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2021-04-19 11:13:04.422733: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1\n",
      "2021-04-19 11:13:04.443580: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1\n",
      "\u001b[38;5;4mℹ Using CPU\u001b[0m\n",
      "\u001b[1m\n",
      "=========================== Initializing pipeline ===========================\u001b[0m\n",
      "[2021-04-19 11:13:11,975] [INFO] Set up nlp object from config\n",
      "[2021-04-19 11:13:11,985] [INFO] Pipeline: ['tok2vec', 'ner']\n",
      "[2021-04-19 11:13:11,989] [INFO] Created vocabulary\n",
      "[2021-04-19 11:13:14,790] [INFO] Added vectors: en_core_web_md\n",
      "[2021-04-19 11:13:14,790] [INFO] Finished initializing nlp object\n",
      "^C\n",
      "\n",
      "Aborted!\n"
     ]
    }
   ],
   "source": [
    "!spacy init config - --lang en --pipeline ner --optimize accuracy | \\\n",
    "spacy train - --paths.train ../../data/reuters_small.spacy  --paths.dev ../../data/reuters_small.spacy \\\n",
    "--initialize.vectors en_core_web_md --output ../../data/reuters_small\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then load the learned model (taking the model version with best performance on the development set):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "nlp = spacy.load(\"../data/reuters_small_spacy/model-best\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we finally run it on the document:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\"> \n",
       "<mark class=\"entity\" style=\"background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    ATLANTA\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">LOC</span>\n",
       "</mark>\n",
       " (\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Reuters\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ") - \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Retailer Best Buy Co\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", seeking new ways to appeal to cost-conscious shoppers, said on Tuesday it is selling refurbished versions of \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple Inc's\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " 3G at its stores that are priced about $50 less than new \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ". The electronics chain said the used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", which were returned within 30 days of purchase, are priced at $149 for the model with 8 gigabytes of storage, while the 16-gigabyte version is $249. A two-year service contract with \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is required. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    New iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " 3Gs currently sell for $199 and $299 at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. &quot;This is focusing on customers' needs, trying to provide as wide a range of products and networks for our consumers,&quot; said \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Scott Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       ", vice president of marketing for \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy Mobile\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". Buyers of first-generation \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " can also upgrade to the faster refurbished 3G models at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", he said. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the exclusive wireless provider for the \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       ", offers refurbished \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " online. The sale of used \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhones\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " comes as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Best Buy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", the top consumer electronics chain, seeks ways to fend off increased competition from discounters such as \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart Stores Inc\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ", which began selling the popular phone late last month. \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Wal-Mart\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " sells a new 8-gigabyte \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " 3G for $197 and $297 for the 16-gigabyte model. The \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    iPhone\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">MISC</span>\n",
       "</mark>\n",
       " is also sold at \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores and \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    AT&amp;T\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " stores. \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Moore\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " said Best Buy's move was not in response to other retailers' actions. (Reporting by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Karen Jacobs\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " ; Editing by \n",
       "<mark class=\"entity\" style=\"background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Andre Grenon\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PER</span>\n",
       "</mark>\n",
       " )</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "skweak.utils.display_entities(nlp(news_text))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "NB: The file `eval_utils.py` contains code for extracting evaluation metrics. It compares the annotations from a given annotation layer (for instance the HMM predictions, or the predictions from a single source) against the gold standard."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.3 64-bit ('base': conda)",
   "name": "python383jvsc74a57bd0a07d3d471594300d0b4aada1f23fc7df0ddb592e079bf5cbd98e598cb7c91673"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
